Rewrite audio DSP plan around fingerprint-first architecture #3

Open
abossard wants to merge 7 commits into mcp-server from claude/research-audio-analysis-N5WGL

Conversation

abossard (Owner) commented May 5, 2026

Replace the prior plan with a three-library pipeline: Essentia (offline
features), Olaf (acoustic fingerprinting), aubio (live fallback), backed by
a single SQLite database for tracks, features, fingerprints, and profiles.

Key changes:

  • Add live track recognition via Olaf so cached features replay in sync
    with whatever the DJ is playing, without loopback.
  • AudioFeatures becomes the single view consumed by scripts, widgets, and
    MCP tools, populated either by LiveAudioAnalyzer or CachedAudioAnalyzer.
  • VCAudioTrigger is rewritten as the audio control center with a library
    browser, recognition badge, drop/build/key indicators, and the existing
    envelope/AGC/trigger/spectral panels.
  • Drop all backwards-compatibility paths: ledfx_compat.js, audio_common.js,
    legacy per-bar triggers, BeatTracker, and old AudioParams DSP fields are
    scheduled for deletion in M7.
  • Sequence the work fingerprint-first: M1 proves Olaf can lock onto EDM
    through DJ EQ and pitch shift before any further engine work.
  • Accept AGPL-3.0 for the combined binary when Essentia is linked; provide
    a -Daudio_essentia=OFF build flag for downstream redistributors.

Adds milestones M0-M8, updated decisions DD1-DD18, SQLite schema,
AudioFeatures struct, and an FMA/Jamendo CC test corpus plan.
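
The "single view" idea in the description can be sketched as one struct with two interchangeable producers. This is a minimal illustration, not the PR's code: the field set of `AudioFeatures` here is an assumption, and `AudioAnalyzer`, `LiveAnalyzerSketch`, `CachedAnalyzerSketch`, and `featuresAt` are hypothetical names standing in for whatever LiveAudioAnalyzer and CachedAudioAnalyzer actually expose.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical field set; the real AudioFeatures struct is defined in the plan.
struct AudioFeatures {
    float rms = 0.0f;
    float bpm = 0.0f;
    bool  beat = false;
    bool  fromCache = false;  // true when replayed from precomputed features
};

// Common interface so scripts, widgets, and tools never care about the source.
class AudioAnalyzer {
public:
    virtual ~AudioAnalyzer() = default;
    virtual AudioFeatures featuresAt(double positionSeconds) = 0;
};

// Stand-in for the live path: returns whatever was measured most recently.
class LiveAnalyzerSketch : public AudioAnalyzer {
public:
    void push(const AudioFeatures& f) { m_last = f; m_last.fromCache = false; }
    AudioFeatures featuresAt(double) override { return m_last; }
private:
    AudioFeatures m_last;
};

// Stand-in for the cached path: looks up precomputed frames by track position.
class CachedAnalyzerSketch : public AudioAnalyzer {
public:
    CachedAnalyzerSketch(std::vector<AudioFeatures> frames, double frameRateHz)
        : m_frames(std::move(frames)), m_rate(frameRateHz) {}

    AudioFeatures featuresAt(double positionSeconds) override {
        if (m_frames.empty() || positionSeconds < 0.0)
            return AudioFeatures{};
        std::size_t i = static_cast<std::size_t>(positionSeconds * m_rate);
        if (i >= m_frames.size())
            i = m_frames.size() - 1;  // clamp past the end of the track
        AudioFeatures f = m_frames[i];
        f.fromCache = true;
        return f;
    }
private:
    std::vector<AudioFeatures> m_frames;
    double m_rate;
};
```

A consumer holds only an `AudioAnalyzer&`, so switching between live analysis and cached replay does not change any calling code.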

claude added 2 commits May 5, 2026 21:37
Adapt the architecture after rubber-ducking it against the research and
adopting the direction "live features first, best possible, low latency":

- Live AudioAnalyzer is M1, shippable on its own through M5. Cached
  features (M6), Olaf identification (M7), chromagram tracking (M8), and
  Tier-1 DJ protocols (M9) extend the same AudioFeatures view incrementally.
- Olaf is no longer used for continuous lock. It runs one-shot on a rolling
  ~5 s buffer to identify the track and seed initial position. This avoids
  Olaf's known brittleness past ~3% time-stretch since identification needs
  only one good match, not continuous lock.
- New PositionSource abstraction with three tiers: DJ-software protocols
  (OS2L beat counter + cached beat grid, Pro DJ Link, StagelinQ), chromagram
  cross-correlation against cached chroma with a small speed search, and
  aubio + internal clock fallback. Highest-priority confident-and-fresh
  tier wins; per-source latency offsets calibrated against onsets.
- SQLite schema adds a `chroma` table holding 12-bin chroma at ~10 Hz per
  track for the chromagram tracker.
- Live latency target codified: <10 ms input-to-onset, <1 ms shared analyzer
  budget, <0.5 ms per AudioChannel; no heap allocation per frame; lock-free
  SPSC ring for snapshots.
- AudioIdentifier is an interface with a Panako backend ready as a build
  option for environments where DJ pitch-bend during the ID window matters.
- VCAudioTrigger ships its live panels in M4; library browser, recognition
  badge, drop/build/key indicators, and position-source picker are added
  incrementally in M6-M9 in the same chrome.
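
The PositionSource rule above ("highest-priority confident-and-fresh tier wins") can be sketched as a small arbitration function. Everything here is an assumption made for illustration: the `PositionEstimate` fields, the `selectPosition` name, and the confidence and freshness thresholds are hypothetical and do not come from the PR.

```cpp
#include <cassert>
#include <vector>

// Illustrative estimate from one position source; all fields are assumptions.
struct PositionEstimate {
    int    tier;             // 1 = DJ protocol, 2 = chromagram, 3 = aubio + clock
    double positionSeconds;  // estimated position within the track
    double confidence;       // 0..1, as reported by the source
    double ageSeconds;       // time since the estimate was produced
    double latencyOffset;    // per-source calibration, in seconds
};

// Picks the lowest-numbered (highest-priority) tier that is both confident
// and fresh. Returns false when no source qualifies.
bool selectPosition(const std::vector<PositionEstimate>& estimates,
                    double& outPositionSeconds,
                    double minConfidence = 0.6,   // hypothetical threshold
                    double maxAgeSeconds = 0.5)   // hypothetical threshold
{
    const PositionEstimate* best = nullptr;
    for (const PositionEstimate& e : estimates) {
        if (e.confidence < minConfidence || e.ageSeconds > maxAgeSeconds)
            continue;  // stale or unsure sources never win
        if (best == nullptr || e.tier < best->tier)
            best = &e;
    }
    if (best == nullptr)
        return false;
    // Apply the per-source latency calibration to the winning estimate.
    outPositionSeconds = best->positionSeconds + best->latencyOffset;
    return true;
}
```

With this shape, a stale tier-1 protocol reading falls through to the chromagram tracker automatically, and the aubio fallback is used only when nothing better qualifies.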
abossard (Owner, Author) commented May 5, 2026

@copilot please finish the live implementation

Copilot AI and others added 3 commits May 5, 2026 22:51
Agent-Logs-Url: https://github.com/abossard/qlcplus/sessions/e4afee83-41dd-431b-ac2f-5dc5cc839725

Co-authored-by: abossard <86611+abossard@users.noreply.github.com>

Copilot AI commented May 5, 2026

@copilot please finish the live implementation

Implemented the first live audio path: AudioCapture now publishes a fixed AudioFeatures snapshot with live RMS/peak/crest, 32-band normalized spectrum, perceptual bands, spectral shape/flux/onset, and beat/BPM state via LiveAudioAnalyzer. Addressed in 03bc4ae.
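
As an illustration of the level features named above, RMS, peak, and crest factor can be computed over one frame without any heap allocation. This is a sketch under assumptions, not the code in 03bc4ae: `LevelSnapshot` and `measureLevels` are hypothetical names, and the input is assumed to be a mono float buffer.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical fixed-size result; no dynamic memory is involved.
struct LevelSnapshot {
    float rms;
    float peak;
    float crest;  // peak / rms, or 0 for a silent frame
};

// Computes level statistics over one frame of mono samples.
LevelSnapshot measureLevels(const float* samples, std::size_t count)
{
    LevelSnapshot out{0.0f, 0.0f, 0.0f};
    if (samples == nullptr || count == 0)
        return out;

    double sumSquares = 0.0;  // accumulate in double to limit rounding error
    float peak = 0.0f;
    for (std::size_t i = 0; i < count; ++i) {
        const float magnitude = std::fabs(samples[i]);
        if (magnitude > peak)
            peak = magnitude;
        sumSquares += static_cast<double>(samples[i]) * samples[i];
    }

    out.rms = static_cast<float>(std::sqrt(sumSquares / static_cast<double>(count)));
    out.peak = peak;
    out.crest = out.rms > 0.0f ? out.peak / out.rms : 0.0f;
    return out;
}
```

A square wave has a crest factor of 1, while percussive material scores much higher, which is why crest is a useful cue alongside RMS.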

Agent-Logs-Url: https://github.com/abossard/qlcplus/sessions/f5a26a26-1240-49f7-a73b-4b9c0d29c0ce

Co-authored-by: abossard <86611+abossard@users.noreply.github.com>
abossard (Owner, Author) commented May 5, 2026

@codex[agent] finish the implementation with tests and actual mo3 files

@chatgpt-codex-connector

To use Codex here, create a Codex account and connect it to GitHub.


Codex AI commented May 5, 2026

@codex[agent] finish the implementation with tests and actual mo3 files

Added LiveAudioAnalyzer unit tests plus deterministic M0–M3 test vectors under engine/test/liveaudioanalyzer/data/, and fixed silence frames to force beat/BPM back to zero in AudioCapture::run(). Commit: 9ba147a.
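
The silence fix described above can be sketched as a gate applied to the tracked beat state before it is published. This is a guess at the shape of the behavior, not the code in 9ba147a: `BeatState`, `gateOnSilence`, and the RMS threshold are hypothetical.

```cpp
#include <cassert>

// Hypothetical subset of the published beat state.
struct BeatState {
    bool  beat;
    float bpm;
};

// Forces beat and BPM back to zero when the frame is effectively silent,
// so stale tempo values do not linger after the music stops.
BeatState gateOnSilence(BeatState tracked, float frameRms,
                        float silenceRms = 1e-4f)  // assumed threshold
{
    if (frameRms < silenceRms)
        return BeatState{false, 0.0f};
    return tracked;
}
```

A unit test for this behavior only needs two frames: one silent frame that must yield zeros, and one audible frame that must pass the tracked state through unchanged.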
